Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Find variables and their paths in a Crunch dataset or variable folder #640

Closed
wants to merge 1 commit into from

Conversation

sluga
Copy link

@sluga sluga commented Apr 12, 2024

WIP for now, because I'm not sure if this belongs here. Currently it returns a data.frame because I'm not sure if existing Crunch abstractions are a great fit this function as it'd be very useful if it not only finds the variables but collects their paths as well.

Copy link
Contributor

@gergness gergness left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I struggle with how much this should "fit in" to the existing stuff or not. I think the way that would best fit into existing conventions would be to add a allVariables() method for a folder that returns a VariablesCatalog, but then as you say there wouldn't be a great way to get the full paths back out. We could do something like cache the full folder path inside the pseudo-variables catalog that we return and then add a function to get the full path for a variable that optionally uses that cached path, but that seems hack-ish enough that we might as well make it how you suggest 🤷

Two process things:

  • The build failure is because you need to run make doc to add this new file to the Collate field in the DESCRIPTION (just a wrapper for devtools::document() if you prefer)
  • Let's do this through the full Jenkins build system, which requires your git branch name to end with a pivotal ticket number (egfolder-variables-recursive-12345)

#'
#' @return Data.frame with one row per Crunch variable and columns \code{alias}, \code{path}, \code{hidden}
#' @export
findVariables <- function(x, deep = FALSE, include.hidden = FALSE) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer dropping the include.hidden param and if you ever wanted it, you'd have to explicitly do rbind(findVariables(ds), findVariables(hiddenFolder(ds)), and make this clear in documentation.

(otherwise needs an include.private option as well)

The API & documentation are kind of confusing about this, because it changed since I joined, but hidden variables are always in the hidden folder (before 2021, there could be hidden variables in any folder), so I think I'd prefer having the only indication in the output that it is hidden be the folder name.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can understand the case for not specifically dealing with hidden & private variables here, though for me the convenience of having include.hidden and include.private wins out. So the current version has them, but if you feel strongly about this, I'll drop them.

R/findVariables.R Show resolved Hide resolved
@sluga
Copy link
Author

sluga commented Apr 18, 2024

#641

@sluga sluga closed this Apr 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants